Abstract



Information about current assessment practices was obtained from 
54 surveys completed by Handicapped Children's Early Education Program 
demonstration projects across the United States. Information about 
factors influencing the selection and continued use of tests also was 
provided. Results indicated that 19 tests were used by five or more 
programs and only one device was used by over half of the responding 
programs. Although most tests were listed as being used for more than 
one purpose, some tests appeared to be used more exclusively than 
others for a particular purpose. The technical adequacy of tests, in 
terms of norms, validity and reliability, was reportedly an important 
factor influencing selection and continued use. However, analysis of 
the 19 most commonly used devices revealed that only three were 
technically adequate. Other methods of assessment also were examined. 
Implications for model practice are discussed. 

Assessment Practices in Model Early Childhood Education Programs 
Camilla A. Lehr, James E. Ysseldyke, and Martha L. Thurlow 

The movement toward early childhood education is generally 
recognized as emerging in the 1960s (Lichtenstein & Ireton, 1984; 
Osborn, 1975). With the passage of the Economic Opportunity Act in 
1964, extensive funding was provided for educational programs for 
preschool children (Ysseldyke & Algozzine, 1984). In 1968, Public Law 
90-538 (Handicapped Children's Early Education Assistance Act) was 
passed, thus providing further support for the education of preschool 
handicapped children. Numerous experimental projects aimed at 
providing enrichment experiences and educational opportunities emerged 
in an attempt to demonstrate the beneficial effects of early 
education. The interest in early childhood education and assessment 
in the 1960s consequently spurred the development of many new 
marketable tests. A review of contemporary preschool assessment 
instruments indicated that well over 200 tests were constructed and 
published in accordance with the Headstart movement (Dykes, 
Strickland, & Munyer, 1979). 

Unfortunately, many of the early childhood tests that were 
developed were of poor quality. In 1971, the Center for the Study of 
Evaluation and the Early Childhood Research Center of the UCLA 
Graduate School of Education published a comprehensive guide of over 
120 preschool and kindergarten tests (Hoepfner, Stern, & Nummedal, 
1971). Of the 120 tests yielding 630 subtests, only 7 subtests were 
rated as providing good measurement validity. Most normative 
evaluations were either poor or fair. 

Scrutiny of a test's technical characteristics is imperative 
because of the decisions that result from test scores. Guidelines for 
test construction and use, including standards for norms, validity and 
reliability, have been developed and outlined in the Standards for 
Educational and Psychological Tests (American Psychological 
Association, 1985). Unfortunately, the manuals of many tests lack 
sufficient information to justify their use for making decisions 
regarding young children. When Thurlow and Ysseldyke (1979) evaluated 
the validity, reliability and norms of the most frequently used tests 
in federally funded Child Service Demonstration Centers (CSDC), only 7 
of the 28 tests were considered technically adequate in all three 
aspects. 

It is evident that technical inadequacy of standardized tests for 
young children is a critical issue in assessment and educational 
decision making today. One of the major purposes of research 
investigating technical adequacy is to provide test consumers with 
information that enables them to choose and use a test in a more 
judicious and appropriate manner (Mardell-Czudnowski & Lessen, 1982). 
Evidence about the technical adequacy of tests can help to prevent 
selection of inappropriate or worthless measures. 

Handicapped Children's Early Education Program 

This study examined current assessment practices used in model 
programs on a national level. The Handicapped Children's Early 
Education Program's (HCEEP) demonstration projects were selected as 
the best source from which to gather this information because of the 
national recognition they have received as model programs and the 
contributions they have made to the field of early education. 

HCEEP was established in 1968 in an effort to provide major 
services and intervention to handicapped children at an early age 
(TADS, 1984). Its purpose is to "support experimental preschool and 
early childhood programs that show promise of promoting a 
comprehensive approach" to the problems of handicapped children and 
their families. The program was initiated by Congress with the 
passage of Public Law 90-538, the Handicapped Children's Early 
Education Assistance Act, and is supported by grants and contracts 
from the Office of Special Education Programs (OSEP) of the U.S. 
Department of Education. 

A major part of the HCEEP (and the subject of this study) is its 
demonstration component. In 1983-84, the demonstration component 
consisted of 82 projects that developed and implemented innovative 
models of early intervention and education of young handicapped 
children. The model demonstration programs are composed of several 
features, including child identification and assessment, 
educational/therapeutic programming for children, evaluation of child 
progress, active parent and family participation, inservice training, 
coordination with public schools and other agencies, evaluation of 
project objectives, and demonstration and dissemination of project 
information (TADS, 1984). A recent report on an analysis of the 
impact of the demonstration and outreach components described their 
accomplishments as being "greater and more varied than for any other 
documented education program identified" (Roy Littlejohn Associates, 
1982). 

Purpose 

The purpose of this study was to determine what early childhood 
assessment instruments and other methods of assessment are being used 
in national model programs serving young handicapped children. 
Factors that influence the selection of tests and continued use of 
tests were also investigated. After gathering information about the 
most commonly listed assessment devices, the technical adequacy of 
each test was analyzed according to the purpose for which it was used. 

Method 

Subjects 

The subjects of this study were 54 HCEEP demonstration projects 
located across the United States. The subjects were obtained from a 
pool of 82 demonstration projects in existence in 1983-84, as listed 
in the 1983-84 Handicapped Children's Early Education Program 
Directory. The Directory provides descriptive information about the 
HCEEP and was produced for the Office of Special Education Programs, 
U.S. Department of Education, by TADS in 1984. The 54 subjects 
represent those projects returning the survey sent to all 
demonstration projects; the number reflects a response rate of 65.9%. 
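
The response rate reported above can be reproduced with a short calculation. The sketch below is illustrative only; the function name `response_rate` is hypothetical and not part of the study.

```python
def response_rate(returned: int, mailed: int) -> float:
    """Return the survey response rate as a percentage."""
    if mailed <= 0:
        raise ValueError("mailed must be a positive count")
    return 100.0 * returned / mailed


# 54 of the 82 demonstration projects returned the survey.
rate = response_rate(returned=54, mailed=82)
print(f"{rate:.1f}%")  # prints 65.9%
```

Rounding to one decimal place matches the precision used throughout the report.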

Materials 

A survey was developed to investigate issues related to current 
assessment practices used in the demonstration projects. 
Specifically, personnel were asked to provide information about 
demographic characteristics of the project and factors influencing the 
selection and continued use of tests. The last section of the survey 
requested a list of actual tests, as well as other informal methods of 
assessment, used for five assessment purposes. The five purposes for 
administering tests, as developed by Salvia and Ysseldyke (1985), are 
defined as follows: (a) Screening: to identify students who are 
sufficiently different from others similar in age that they require 
special attention or assessment, (b) Classification/Placement: to 
identify students who are eligible for special education services, (c) 
Instructional Planning: to assist staff in planning educational 
programs (deciding what to teach and how to teach) for individuals, 
(d) Pupil Evaluation: to monitor individual progress, and (e) Program 
Evaluation: to evaluate the effectiveness of the educational program. 
A copy of the survey is included in Appendix A. 

Procedure 

The surveys were sent by mail to all 82 HCEEP demonstration 
projects funded during 1983-84, with cover letters and stamped 
envelopes enclosed for their return. A follow-up postcard was mailed 
approximately six weeks later to those centers not responding to the 
initial mailing. 

Results 

Project Information 

Fifty-one demonstration projects provided information about the 
number of children served during 1983-84. Three of these projects 
reported serving over 200 children each during 1983-84. These data 
were significantly discrepant from information reported in the 1983-84 
edition of the HCEEP directory and were not used to calculate the 
number of children served. Compilation of data from the remaining 48 
surveys indicated that the total number of children served during 
1983-84 was 1,621. Within this sample the number of children served 
varied considerably among programs, ranging from 8 to 120 (X = 33.8, 
SD = 24.3). 

For 53 projects, the average number of years of funding was 3.62 
and ranged from 2 to 16 years. The age of children served ranged from 
prenatal care to 6 years. Thirty-eight programs (70.4%) provided 
services to children beginning at or before birth and 17 programs 
(31.5%) served children until the age of 6. The average ages of 
children served ranged from .80 years for the youngest to 4.53 years 
for the oldest. Age ranges 
appropriately reflected the early childhood population that the 
demonstration projects aim to serve. 

Information provided about characteristics of the target 
population served was very general. The majority of respondents 
(53.7%) described the target population as having "various mild to 
moderate handicaps" without further specification. Twenty-seven 
percent described the children they served as being "at-risk." Only 
14.8 percent described specific handicapping conditions of the target 
population served. These exclusive definitions included children who 
were specifically referred to as (a) language impaired (5.5%), (b) 
hearing impaired (3.7%), (c) behaviorally disordered (1.9%), (d) 
severely/profoundly retarded (1.9%), and (e) visually impaired (1.9%). 
Because of the small number of children with specific handicapping 
conditions, tests were not analyzed according to the population 
served. 

Factors Influencing Selection and Continued Use of Tests 

Respondents were asked to select two factors (from a list of 10) 
that influenced the selection of tests used in the demonstration 
program. Table 1 is a summary of the factors influencing the 
selection of tests that were checked by respondents from the 
demonstration projects. Responses indicated that the most common 
factor influencing the selection of tests was whether the test was 
"recommended by other professionals" (64.8%). This was followed by 
"technical considerations (norms, reliability, and validity)" (61.1%). 
Twenty-two percent of the respondents indicated that availability or 
access to the test was an important factor. Approximately 18% 
selected inservice training workshops as influencing their selection 
of tests. Responses suggested that use of Tests in Print or Buros' 
Mental Measurements Yearbook, textbooks or journal articles, as well 
as publishers' catalogs and the cost of the tests, were relatively 
unimportant factors influencing test selection. Twenty-nine percent 
of the respondents checked the category "other" as influencing the 
selection of tests. Examination of the listed responses indicated 
several other factors that influenced test selection, including: (a) 
professional experience and expertise, (b) use of Educational 
Resources Information Center (ERIC), (c) graduate training with a 
particular instrument, (d) mandated use by county or state, (e) 
familiarity, and (f) compatibility with program objectives. 

Table 1 

Factors Influencing the Selection of 
Tests Used in HCEEP Demonstration Projects 

                                          Percentage of Programs
Factor                                    Checking Factor

Recommended by Other Professionals                 64.8
Technical Considerations (Norms,
  Reliability, Validity)                           61.1
Availability or Access to Test                     22.2
Inservice Training Workshops                       18.5
Publisher's Catalog                                 5.5
Textbook                                            3.7
Use of Tests in Print or Mental
  Measurements Yearbook (Buros)                     1.8
Cost                                                1.8
Journal                                             0.0
Other                                              29.6

After a test has been selected, it is either discarded or 
continues to be used for various reasons. Respondents were asked to 
rate nine factors, in the order of their importance, influencing the 
continued use of tests. Forty-one respondents rated the factors as 
directed. Ratings were added together for each factor and averaged. 
"Information gathered from test results" received the highest average 
rating. The second most important factor was an "appropriate norming 
population." This was closely followed by the test's validity 
(third), reliability (fourth), ease of administration (fifth), and 
professional recommendations (sixth). Of lesser importance were the 
following factors: (a) common use of the test in the past (seventh), 
(b) favorable description by the test market (eighth) and (c) cost. 
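
The procedure of summing and averaging ratings for each factor can be sketched as follows. This is a minimal illustration, not the authors' actual analysis: the respondent data are invented, the function name `rank_factors` is hypothetical, and the scale direction (a higher number meaning greater importance) is an assumption, since the survey's rating scale is not described here.

```python
from statistics import mean


def rank_factors(responses, higher_is_more_important=True):
    """Average each factor's rating across respondents and order the factors.

    `responses` is a list of dicts mapping factor name -> numeric rating.
    The scale direction is an assumption; flip the flag for rank-style data
    where 1 means most important.
    """
    factors = responses[0].keys()
    averages = {f: mean(r[f] for r in responses) for f in factors}
    return sorted(averages.items(), key=lambda kv: kv[1],
                  reverse=higher_is_more_important)


# Hypothetical ratings from three respondents (not data from the study).
responses = [
    {"test results": 9, "norming population": 8, "cost": 1},
    {"test results": 8, "norming population": 9, "cost": 2},
    {"test results": 9, "norming population": 7, "cost": 1},
]

for factor, avg in rank_factors(responses):
    print(f"{factor}: {avg:.2f}")
```

Only respondents who rated every factor as directed can be averaged this way, which is consistent with the report using 41 of the returned surveys for this item.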

Assessment Devices 

Fifty-two HCEEP demonstration projects provided information about 
the specific tests used to assess children and the purposes for which 
each test was used. The number of devices listed by each program 
varied, ranging from 1 to 16 (X = 6.2, SD = 3.8). A total of 109 
tests was listed by 52 programs. Nineteen tests were used by five or 
more programs. (Seven projects listed unpublished project-developed 
tests, but each test was different.) No single test was used by every 
program. Only one test was used by over half of the responding 
programs: the Bayley Scales of Infant Development (52.8%). The specific 
devices used by five or more programs and the purposes for which they 
were used are summarized in Table 2. 

Table 2 

Percentages of HCEEP Model Demonstration Projects 
Using Assessment Devices for Different Purposes 

                                    Purpose for Which Used (c)
                   % of Programs  ----------------------------------------------------------
                   Using                                 Instructional  Pupil       Program
Device (a)         Device (b)     Screening  Placement   Programming    Evaluation  Evaluation

Alpern-Boll           15.1          12.5       50.0         37.5          50.0        62.5
Bayley                52.8          10.7       71.4         14.3          25.0        42.9
Brigance              20.8           0.0       36.4         90.9          63.6        18.2
Denver                30.2          93.8        6.3          0.0           0.0         0.0
E-LAP                 13.2          14.3       28.6         71.4          71.4        28.6
HELP                  13.2           0.0       14.3        100.0          57.1        57.1
K-ABC                 11.3           0.0      100.0         33.3          50.0        50.0
LAP                   15.1           0.0       50.0         75.0          50.0        50.0
Leiter                 9.4           0.0       80.0          0.0          20.0        20.0
McCarthy              18.9          10.0       90.0         20.0          20.0        30.0
Portage Guide          9.4           0.0        0.0        100.0          40.0        20.0
PPVT-R                13.2          14.3       85.7         28.6          28.6        28.6
REEL                   9.4          20.0       60.0          0.0          40.0        40.0
SICD                  20.8           0.0       72.7         45.4          45.4        27.3
Stanford-Binet        18.9          10.0       70.0          0.0          30.0        30.0
Uzgiris-Hunt          13.2           0.0        0.0        100.0          71.4        28.6
UPAS                   9.4           0.0       20.0        100.0          60.0        80.0
Vineland              17.0           0.0       66.7          0.0          44.4        11.1
Zimmerman PLS         13.2          28.6       71.4         57.1          42.9        14.3
Project-Developed     13.2          57.1        0.0         42.9          28.6        28.6

(a) Full names of tests are provided in Appendix B. 
(b) Percentages reflect numbers of HCEEP projects mentioning assessment device. 
(c) Percentages reflect numbers of HCEEP projects using the device for each purpose based 
only on those listing the method. 

It is apparent from Table 2 that programs using a particular test 
may have listed it for several different purposes. A summary of tests 
that were used for a particular purpose by 80% or more of the programs 
using it is provided in Table 3. Ninety-three percent of the programs 
that listed the Denver Developmental Screening Test listed it for the 
purpose of making screening decisions. No single test was used by 80% 
or more of the programs for the purpose of pupil evaluation. Of the 
programs listing the use of the Uniform Performance Assessment System 
(N = 5), 100% listed it for the purpose of instructional programming, 
and 80% (N = 4) listed it for the purpose of program evaluation. 
Other tests that were listed for instructional programming by at least 
80% of the programs using the tests were: (a) Uzgiris-Hunt Ordinal 
Scales of Development (100%), (b) Hawaii Early Learning Profile 
(100%), and (c) Brigance Inventory of Early Development (90%). Tests 
that were used for making placement decisions (listed by at least 80% 
of the programs based only on the number of programs listing the 
particular test) were: (a) Kaufman Assessment Battery for Children 
(100%), (b) McCarthy Scales of Children's Abilities (90%), and (c) 
Peabody Picture Vocabulary Test-Revised (86%). 
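
The selection rule behind Table 3 (keep a device for a purpose only when at least 80% of the programs citing the device listed it for that purpose, per the note to Table 3) can be sketched as a simple filter. The percentages below are a subset of Table 2 as read from the report; the names `TABLE2_SUBSET` and `primary_uses` are hypothetical, and the sketch assumes every device in the input was already cited by five or more programs.

```python
# Percent of programs citing each device that listed it for each purpose
# (subset of Table 2; all four devices were cited by five or more programs).
TABLE2_SUBSET = {
    "Bayley": {"screening": 10.7, "placement": 71.4, "instructional": 14.3,
               "pupil_eval": 25.0, "program_eval": 42.9},
    "Denver": {"screening": 93.8, "placement": 6.3, "instructional": 0.0,
               "pupil_eval": 0.0, "program_eval": 0.0},
    "K-ABC": {"screening": 0.0, "placement": 100.0, "instructional": 33.3,
              "pupil_eval": 50.0, "program_eval": 50.0},
    "UPAS": {"screening": 0.0, "placement": 20.0, "instructional": 100.0,
             "pupil_eval": 60.0, "program_eval": 80.0},
}


def primary_uses(table, threshold=80.0):
    """Return (device, purpose, percent) triples at or above the threshold."""
    return [
        (device, purpose, pct)
        for device, by_purpose in table.items()
        for purpose, pct in by_purpose.items()
        if pct >= threshold
    ]


for device, purpose, pct in primary_uses(TABLE2_SUBSET):
    print(f"{device}: {purpose} ({pct}%)")
```

In this subset the Bayley, although the most widely used device, never reaches the cutoff for any single purpose, which illustrates how a test can be popular without being used predominantly for one kind of decision.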

The data on the purposes for which each assessment device is used 
reveal that nearly all devices are used for multiple purposes. All of 
the 19 tests listed by five or more programs were used for at least 
three purposes. However, many tests are used for some purposes more 
than for others. Tests that were listed five or more times within a 
particular category of purpose are listed in Table 4. The percentages 
listed in this table reflect the number of programs using a particular 
device out of all programs listing tests. 

Table 3 

Percentage of Programs Listing Tests for Each Purpose 
Based Only on the Number of Programs Citing the Device (a) 

Purpose                      Device            Percentage

Screening                    Denver               93.8

Classification/Placement     K-ABC               100.0
                             McCarthy             90.0
                             PPVT-R               85.7

Instructional Planning       HELP                100.0
                             Portage Guide       100.0
                             UPAS                100.0
                             Uzgiris-Hunt        100.0
                             Brigance             90.9

Pupil Evaluation             (none)

Program Evaluation           UPAS                 80.0

(a) Table includes only those devices that were cited by 80% or more of the programs for a 
particular purpose based only on the programs citing the device. The device had to be 
cited by at least five programs to be included. 

Table 4 

Percentage of Programs Listing Tests for Each Purpose Based on the Total Number of Programs (a) 

Purpose                      Device            Percentage

Screening                    Denver               28.3

Classification/Placement     Bayley               37.7
                             McCarthy             17.0
                             SICD                 15.1
                             Stanford-Binet       13.2
                             K-ABC                11.3
                             PPVT-R               11.3
                             Vineland             11.3
                             Zimmerman PLS         9.4

Instructional Planning       Brigance             18.9
                             Uzgiris-Hunt         13.2
                             HELP                 13.2
                             LAP                   9.4
                             Portage Guide         9.4
                             SICD                  9.4
                             UPAS                  9.4

Pupil Evaluation             Bayley               13.2
                             Brigance             13.2
                             E-LAP                 9.4
                             Uzgiris-Hunt          9.4
                             SICD                  9.4

Program Evaluation           Bayley               22.6

(a) Table includes only those devices mentioned by five or more programs. Percentages are 
based on the total number of programs. 

For screening, the most commonly used test, listed by 28.3% of 
the programs, was the Denver Developmental Screening Test. For the 
purpose of making placement decisions, eight tests were listed by five 
or more programs, with the most commonly used test being the Bayley 
Scales of Infant Development (37.7%). For the purpose of 
instructional programming, seven tests were listed by five or more 
programs. The most commonly listed test for instructional programming 
was the Brigance Inventory of Early Development (18.9%). For pupil 
evaluation, five tests were listed by five or more programs, with the 
most commonly used tests being the Bayley (13.2%) and the Brigance 
(13.2%). Last, for program evaluation, the most commonly listed test 
was the Bayley Scales of Infant Development (22.6%). 
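
The two kinds of percentages in Tables 2 and 4 are linked by a simple product: the share of all programs using a device for a purpose equals the share of programs using the device times the share of those users who listed that purpose. The sketch below checks this against three values from the tables; the function name is hypothetical.

```python
def share_of_all_programs(pct_using_device: float,
                          pct_of_users_for_purpose: float) -> float:
    """Convert a within-device percentage (Table 2) to a percentage of all
    programs (Table 4)."""
    return pct_using_device * pct_of_users_for_purpose / 100.0


# (device, % of programs using it, % of those users listing the purpose)
examples = [
    ("Bayley, placement", 52.8, 71.4),
    ("Denver, screening", 30.2, 93.8),
    ("Brigance, instructional planning", 20.8, 90.9),
]

for label, pct_using, pct_purpose in examples:
    print(f"{label}: {share_of_all_programs(pct_using, pct_purpose):.1f}%")
```

The three results round to 37.7, 28.3, and 18.9, matching the values reported for these devices in the text above.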

Respondents also were asked to list, according to the five 
categories of purposes, other methods that are currently used in their 
programs. Overall, the other methods of assessment that were used 
fell into 10 categories and included information gathered from: (a) 
parental involvement, using interviews, questionnaires or consumer 
satisfaction measures, (b) observation, (c) teacher or staff input 
from questionnaires, meetings, or interviews, (d) individualized 
educational programs (IEP) development and reviews, (e) referrals or 
records including medical histories, (f) continuous monitoring and 
data collection, (g) videotapes of child's interaction, (h) home 
visits, (i) Family Needs Assessment, and (j) evaluations conducted by 
an outside party. Nearly all of the respondents listed parental 
involvement (94.3%) and observational methods (83.0%) as other means 
of assessment. With the exception of outside evaluations, which were 
exclusively used for the purpose of program evaluation, all of the 
methods were listed as being used for at least three purposes. A 
summary of other methods listed as being used by HCEEP demonstration 
projects across purposes is presented in Table 5. 

The average number of tests and the average number of other 
methods listed for each purpose are summarized in Table 6. For each 
purpose, it appears that both tests and other methods are used to help 
make decisions. However, results suggest that tests are more 
frequently used than other methods when making decisions about 
classification and placement, instructional planning, and pupil 
evaluation. This is most evident in the category of classification 
and placement (t = 5.1, p < .01). 
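
A comparison of this kind can be sketched with a paired t statistic, since each program contributes both a count of tests and a count of other methods for a given purpose. The report does not state which form of t test was used, so the paired design is an assumption, the function name `paired_t` is hypothetical, and the counts below are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic: mean difference over its standard error."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Hypothetical counts for four programs (not data from the study):
tests_listed = [3, 2, 4, 2]
other_methods_listed = [1, 1, 1, 0]

t = paired_t(tests_listed, other_methods_listed)
print(f"t = {t:.2f} with {len(tests_listed) - 1} degrees of freedom")
```

A positive t indicates that more tests than other methods were listed, which is the direction reported for classification and placement.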

Table 5 

Percentages of HCEEP Model Demonstration Projects Using 
Other Methods of Assessment for Different Purposes 

                                              Purpose (b)
                      Percent of   ---------------------------------------------------------------------
                      Programs                                  Instructional  Pupil        Program
Method                Using                                     Programming    Evaluation   Evaluation
                      Method (a)   Screening    Placement

Parental Involvement     94.3        56.0      [illegible]    [illegible]     14.0         56.0
Observation              83.0        59.1      [illegible]    [illegible]   [illegible]    20.5
Teacher/Staff Input      37.7        25.0         15.0           40.0         10.0         45.0
IEP Review               30.2         0.0          0.0           18.8         75.0         56.3
Referral Information/
  Records                20.8     [illegible]  [illegible]    [illegible]      0.0          0.0
Graphing/Data
  Collection             20.8         0.0          0.0           36.4         72.7         27.3
Videotapes               13.2         0.0         28.6           57.1         28.6         57.1
Home Visits               7.5        25.0         25.0            0.0         50.0         50.0
Family Needs
  Assessment              5.7        33.3          0.0           66.7         33.3         33.3
Outside Evaluations       5.7         0.0          0.0            0.0          0.0        100.0

(a) Percentages reflect numbers of HCEEP projects mentioning method of assessment. 
(b) Percentages reflect numbers of HCEEP projects using the method for each purpose based 
only on those listing the method. 

Table 6 

Average Number of Tests and Other Methods Listed for Each Purpose 

Purpose                      Tests (X)    Other Methods (Y)      t

Screening                      1.46            1.68            -1.0
Classification/Placement       2.48             .78             5.1*
Instructional Planning         2.83            1.31             3.9*
Pupil Evaluation               2.57            1.20             4.5*
Program Evaluation             1.87            1.46             1.4

*Significant at .01 or less 

Technical Considerations 

Data on the commercial tests used for assessment by five or more 
projects (those listed in Table 2) were judged in terms of their 
technical adequacy. In order to be judged technically adequate, the 
test's norms, validity and reliability must all meet specified 
criteria. The criteria used in this study were compiled from several 
sources including the Standards for Educational and Psychological 
Tests (American Psychological Association, 1985), Assessment in 
Special and Remedial Education (Salvia & Ysseldyke, 1985), and an 
article by Mardell-Czudnowski and Lessen (1982). The criteria used 
for evaluating each test's norms, reliability and validity are 
presented in Table 7. All of the tests except the Portage Guide to 
Early Education and the Uzgiris-Hunt Ordinal Scales of Development 
were analyzed in light of their use as an instrument to help make 
classification and placement decisions, which requires the most 
stringent reliability coefficients. Tests that were specifically 
described as criterion referenced (although they may have given some 
age guidelines) were not analyzed with respect to the technical 
adequacy of their norms because they presumably are not used to make 
normative comparisons. Only information contained in the most current 
test's manual was used to analyze each instrument's technical 
adequacy. 

The tests used by the model demonstration projects are evaluated 
in Table 8. The evaluation indicated that of the 19 instruments used 
by five or more programs, only three were technically adequate on all 
three dimensions (using most stringent reliability criteria as 
dictated by the purpose for which the test is used by the model 
program). 
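
The decision rule described here (adequate only if norms, validity, and reliability all pass, with the reliability cutoff depending on the kind of decision) can be sketched as below. The cutoffs of .90, .80, and .60 are those listed in Table 7; the function and dictionary names are hypothetical, and reducing the norms and validity criteria to single yes/no judgments is a simplification made for illustration.

```python
# Minimum reliability coefficient by type of decision (from Table 7).
MIN_RELIABILITY = {
    "individual": 0.90,  # e.g., instructional planning and placement
    "screening": 0.80,
    "group": 0.60,       # administrative purposes and group decisions
}


def technically_adequate(reliability, decision_type, validity_ok,
                         norms_ok=True, norm_referenced=True):
    """Return True only if every applicable dimension meets its criterion.

    Norms are skipped for criterion-referenced tests, which were not
    evaluated on that dimension.
    """
    if decision_type not in MIN_RELIABILITY:
        raise ValueError(f"unknown decision type: {decision_type!r}")
    reliability_ok = reliability >= MIN_RELIABILITY[decision_type]
    norms_pass = norms_ok if norm_referenced else True
    return bool(reliability_ok and validity_ok and norms_pass)


# The same hypothetical test passes for screening but fails for placement.
print(technically_adequate(0.85, "screening", validity_ok=True))
print(technically_adequate(0.85, "individual", validity_ok=True))
```

Because a single failing dimension makes the whole test inadequate, judging every device against the strictest reliability cutoff, as the analysis did, yields a conservative count of adequate tests.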

Table 7 

Criteria for Evaluating Technical Adequacy of Tests 

Norms 

1. Norms should be available in the manual or in an accompanying 
   technical publication. 
2. Norms should be clearly defined and describe the populations, 
   especially for comparative purposes. Such defining characteristics 
   of populations should include: the age(s), grade level, gender, 
   geographic regions used, race, and handicapping conditions found 
   within the norming population. 
3. The norm-sampling method should be well defined. 
4. The norm sampling should not have been based upon convenience or 
   readily available populations. 
5. Revised tests should provide norm comparisons for all forms. It 
   should be noted that criterion-referenced tests do not need to 
   present norming information (Popham, 1971). 
6. One hundred should be the minimum number of persons in any norm 
   sample per age or grade (Salvia & Ysseldyke, 1985). 
7. In assessing individuals with handicapping conditions, the test 
   user should use either regular or special norms, depending on the 
   purpose of the testing. 
8. The test's norms should not be older than 15 years (Ysseldyke & 
   Algozzine, 1984). 

Reliability 

1. The manual should present evidence of reliability. Although the 
   manual should contain the reports on reliability, additional 
   sources, such as technical reports, should be consulted. 
2. Reliability coefficients as well as standard errors of measurement 
   should be presented in a tabular format. 
*3. Reliability procedures and samples should be described. 
4. At least one type of reliability used should be stated (i.e., 
   test-retest, alternate form, internal consistency, split-half, 
   interrater reliability). 
5. For making decisions regarding individuals, reliability 
   coefficients must be greater than or equal to .90 (e.g., 
   instructional planning and placement). 
6. For screening decisions, reliability coefficients of .80 are 
   acceptable. 
7. For administrative purposes and group decisions, reliability of 
   .60 is acceptable. 

Validity 

1. Validity should be reported in the manual or in an accompanying 
   technical publication. 
2. Evidence of at least one type of validity should be presented for 
   the major types of inferences for which the use of a test is 
   recommended (i.e., criterion-related; concurrent or predictive; 
   content; construct). 
3. For content validity, the manual should define the content 
   area(s). Tests that are based on content validity should update 
   content in revised forms. 
4. For construct validity, the manual should clearly define the 
   ability or aptitude measured. For tests for which there is a time 
   limit, the manual should state how speed affects scores. 
5. For both types of criterion-related validity (a) the criteria 
   should be defined; (b) validity should be reported; (c) samples 
   should be completely described; (d) correlation coefficients with 
   other tests should be reported; and (e) for predictive validity, a 
   statement concerning the length of time for which predictions can 
   be made should be included. 

*In this analysis, reliability studies with sample sizes less than 25 were not considered. 

Table 8 

Technical Adequacy of Devices Used by 
Five or More HCEEP Demonstration Projects 

Portage (1976) 

(a) Evidence of content validity only is based on information contained 
in test manuals only. 
Ratings in table are: + technically adequate, - technically 
inadequate, * manual not available. 

Discussion 

This study investigated current assessment practices of HCEEP 
demonstration projects. Many studies have documented the educational 
contributions and effectiveness of HCEEP demonstration projects, but 
none have comprehensively examined the assessment practices actually 
used in these model programs. 

The selection of tests that are used in HCEEP demonstration 
models reportedly is based largely on recommendations by other 
professionals and technical considerations. Written material from 
textbooks, journals or publisher catalogs does not appear to be of 
utmost importance when tests are selected. However, inservice 
training workshops appear to have some impact on whether a test is 
selected for use. This information has implications for those who 
develop and market assessment instruments, suggesting that the chances 
of a test being selected are increased if it is technically adequate 
and has been used and recommended by other professionals in the field. 

The continued use of tests in HCEEP model demonstration projects 
is most strongly influenced by the information gathered from the 
test's results. This makes intuitive sense and one would hope that 
HCEEP model programs are using tests that provide useful information. 
The next most important factors influencing the continued use of tests 
were the tests' norms, validity and reliability. Again, it appears 
that HCEEP projects are adhering to guidelines which document the 
importance of using technically adequate devices (American 
Psychological Association, 1985; Salvia & Ysseldyke, 1985; Ysseldyke & 
Algozzine, 1984). However, examination of the tests that are being 
used reveals that, in many cases, HCEEP Model Demonstration projects 
are using devices that are technically inadequate. 

Over 100 tests were listed by the 54 projects surveyed. Nineteen 
tests were used by five or more programs and only one test was used by 
over half of the responding programs. All of the tests listed were 
analyzed according to the purpose for which they were used. Nearly 
all of the 19 tests listed were used for several purposes. However, 
some tests were used more exclusively for some purposes than for 
others. For example, the Denver Developmental Screening Test was 
generally used for screening. For making placement and classification 
decisions, the Kaufman Assessment Battery for Children, McCarthy 
Scales of Children's Abilities and Peabody Picture Vocabulary 
Test-Revised were primarily used. Five tests (HELP, Portage Guide, 
UPAS, Uzgiris-Hunt, and Brigance) were used particularly for 
instructional planning and one was particularly cited for program 
evaluation (UPAS). 

Results suggested that no single test was used for pupil 
evaluation more exclusively than others. Perhaps the demonstration 
programs relied more heavily on informal data-based measurement 
systems to monitor individual student progress. Such measures usually 
are tied to the curriculum, simple to administer, reliable, valid and 
sensitive to small fluctuations in student performance (Ysseldyke, 
Thurlow, Graden, Wesson, Algozzine, & Deno, 1983). Yet, some model 
programs continue to use tests that yield IQs for measuring pupil 
progress — tests that are clearly inappropriate (Salvia & Ysseldyke, 
1985). Examination of the data indicated that nearly 21% of the 
programs listed continuous graphing and data collection as another 
method of assessment. Of those who listed this measurement technique, 
72% used it for pupil evaluation. However, tests were still listed 
more frequently than other informal methods as a means of evaluating 
or monitoring student progress. 

Of the 19 tests most frequently cited, and analyzed according to 
the purpose for which they were used (using the most stringent 
reliability criteria, r > .90), only three had technically adequate 
norms, validity and reliability. Although results suggested that 
tests are selected and continue to be used based largely on their 
technical adequacy, many of the tests used are not technically 
adequate. Practitioners might explain this discrepancy by pointing 
out that these are the only tests available. Nevertheless, using a 
technically inadequate device cannot be justified or excused, because 
of the important decisions that are made based on the data gathered 
from such devices. 
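
The three-part screen described above can be illustrated with a 
minimal sketch. It assumes that each device has already been judged 
on norms and validity (recorded here as simple yes/no flags, which is 
a simplification) and applies the r > .90 reliability cutoff stated 
in the text. The record type, field names, and the three example 
devices are hypothetical and are not data from this study. 

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    """Hypothetical summary of one assessment device (illustrative only)."""
    name: str
    reliability: float       # reported reliability coefficient r
    adequate_norms: bool     # assumed to be judged separately
    adequate_validity: bool  # assumed to be judged separately


def is_technically_adequate(device: DeviceRecord,
                            min_reliability: float = 0.90) -> bool:
    """Return True only if norms, validity and reliability all pass.

    The strict inequality mirrors the 'r > .90' criterion in the text.
    """
    return (device.adequate_norms
            and device.adequate_validity
            and device.reliability > min_reliability)


# Made-up example devices, not drawn from the survey results.
examples = [
    DeviceRecord("Hypothetical Device A", 0.93, True, True),
    DeviceRecord("Hypothetical Device B", 0.85, True, True),
    DeviceRecord("Hypothetical Device C", 0.95, False, True),
]

adequate = [d.name for d in examples if is_technically_adequate(d)]
print(adequate)
```

A device fails the screen if any one of the three criteria is not 
met, which is why high reliability alone (Device C) is not enough. 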

The criteria used for analyzing a test's technical adequacy in 
this study were drawn from guidelines provided by the APA (1985) and 
from research conducted by Ysseldyke and Algozzine (1984) and 
Mardell-Czudnowski and Lessen (1982). Although a test user can be 
more confident of a test's technical adequacy based on these criteria, 
it is essential for test users to examine qualitative aspects of 
norms, reliability and validity in addition to quantitative indexes. 
For example, sample size in reliability studies must be considered, 
research studies investigating validity that do not appear in the 
manual perhaps should be considered, the test's accuracy in making 
correct decisions might be examined, and the test's purpose as 
designated by its authors must be examined. Ultimately, it is the 
test user's responsibility to determine the value of a test based on 
documented research. 

Fortunately, it looks as though decisions that are made about 
children in HCEEP programs are based on more than one test. In 
addition, other methods are used for the assessment of young children. 
Nearly all of the projects used information gathered from parents and 
observations in their assessment process. Other sources of 
information included teacher/staff input, IEP review, continuous 
graphing and data collection, and home visits. These techniques 
become especially important when the inadequacy of the tests used is 
considered. Although the use of these methods appears to be secondary 
and supplementary to the use of tests in making decisions, the HCEEP 
model demonstration programs may be shifting toward a comprehensive 
process of data gathering for making educational decisions. 

HCEEP projects have developed many products to assist in the 
assessment of young children. As model early childhood programs, they 
are charged with the responsibility of using and developing sound 
assessment practices. Perhaps among the 100 tests that were listed by 
the 54 projects surveyed, some of the less frequently used 
project-developed tests hold promise in providing useful information 
and being technically adequate. If so, it is important to disseminate 
this information to wider audiences who are in need of sound devices 
on which to base early childhood decisions. In addition, awareness 
and use of other methods that contribute to a comprehensive process 
of data gathering are of critical importance in making decisions 
about young handicapped children. 
HCEEP ASSESSMENT DEVICE SURVEY 



1. Center Information 

a. Years of funding: 19__ to 19__ 

b. Age range of children served: ____ to ____ 

c. Number of children served in 1983-84: ____ 

d. Target population(s) served: ____ 



2. Tests are selected in various ways. We are interested in knowing 
what factors influence the selection of tests you use in your program. 
Please check the two factors that most often apply. 

____ Inservice training workshops 

____ Recommended by other professionals 

____ Use of Tests in Print or editions of Mental Measurements Yearbook (Buros) 

____ Publisher's catalog or advertisements describing test 

____ Availability or access to test 

____ Technical considerations (norms, reliability, validity) 

____ Textbook (name) ____ 

____ Journal article (name) ____ 

____ Cost of test 

____ Other (please list) ____ 



3. Tests are used for various reasons. How important are the factors 
listed below in influencing the use of the most commonly used tests 
in your program? Please rate each in order of importance (1 = most 
important, 2 = next most important, . . . 9 = least important). 

____ Easy to administer 

____ Common use of test in the past 

____ Appropriate norming population 

____ Adequate reliability as specified by APA standards (1972) 

____ Adequate validity as specified by APA standards (1972) 

____ Information gathered from test results 

____ Recommended by other professionals 

____ Cost of test 

____ Favorably described by test market 

4. Tests are administered for a variety of purposes. For each purpose 
defined below, list the tests and any other methods that your program 
currently uses. 

For each purpose, two response columns are provided: 
TESTS USED (e.g., McCarthy Scales of Children's Abilities) 
OTHER METHODS (e.g., parent interview, observation) 

PURPOSE 

SCREENING 
To identify students who are sufficiently different from others 
similar in age that they require special attention or assessment. 

CLASSIFICATION/PLACEMENT 
To identify students who are eligible for special education services. 

INSTRUCTIONAL PLANNING 
To assist staff in planning educational programs (deciding what to 
teach and how to teach) for individuals. 

PUPIL EVALUATION 
To monitor individual progress. 

PROGRAM EVALUATION 
To evaluate the effectiveness of the educational program. 